Formant diphone parameter extraction utilising a labelled single-speaker database

Author

  • Robert H. Mannell
Abstract

This paper examines a method for extracting formant parameters from a labelled single-speaker database for use in a formant-parameter diphone-concatenation speech synthesis system. The procedure begins with an initial formant analysis of the labelled database, which is then used to obtain formant (F1-F5) probability spaces for each phoneme. These probability spaces guide a more careful speaker-specific extraction of formant frequencies. An analysis-by-synthesis procedure is then used to provide best-matching formant intensity and bandwidth parameters. The great majority of the parameters so extracted produce speech which is highly intelligible and which has a voice quality close to that of the original speaker.
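
The idea of phoneme-specific probability spaces guiding formant selection can be illustrated with a minimal sketch. The abstract does not say how the probability spaces are modelled, so everything here is an assumption made for illustration: each formant of each phoneme is treated as an independent Gaussian estimated from the first-pass analysis, and the function names `build_probability_spaces`, `log_likelihood`, and `pick_formants` are hypothetical. Only two formants are used to keep the example short.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def build_probability_spaces(labelled_frames):
    """Estimate a (mean, std) pair per formant for each phoneme.

    labelled_frames: iterable of (phoneme, [F1, F2, ...]) in Hz, taken from
    a first-pass formant analysis. Each phoneme needs at least two frames.
    ASSUMPTION: a Gaussian per formant stands in for the paper's
    "probability space", which the abstract does not define.
    """
    by_phoneme = defaultdict(list)
    for phoneme, formants in labelled_frames:
        by_phoneme[phoneme].append(formants)

    spaces = {}
    for phoneme, rows in by_phoneme.items():
        columns = zip(*rows)  # one column of values per formant
        # Floor the std at 1 Hz so identical values do not give zero spread.
        spaces[phoneme] = [(mean(c), max(stdev(c), 1.0)) for c in columns]
    return spaces


def log_likelihood(freq, mu, sigma):
    """Gaussian log-likelihood of freq, up to an additive constant."""
    return -0.5 * ((freq - mu) / sigma) ** 2 - math.log(sigma)


def pick_formants(phoneme, candidates, spaces):
    """For each formant slot, pick the most likely candidate peak (Hz)."""
    return [
        max(candidates, key=lambda f: log_likelihood(f, mu, sigma))
        for mu, sigma in spaces[phoneme]
    ]


# Illustrative first-pass measurements (made-up values, not from the paper).
frames = [
    ("i", [280, 2250]), ("i", [300, 2300]), ("i", [320, 2350]),
    ("a", [700, 1200]), ("a", [750, 1250]),
]
spaces = build_probability_spaces(frames)
chosen = pick_formants("i", [310, 950, 2280, 3100], spaces)
```

This sketch scores each formant slot independently, so it does not stop two slots from choosing the same peak or enforce F1 < F2; a fuller treatment would need such constraints, and the subsequent analysis-by-synthesis step for intensities and bandwidths is not shown.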

Similar resources

Expressing vocal effort in con

A new diphone database with a full diphone set for each of three levels of vocal effort is presented. A theoretical motivation is given why this kind of database will be useful for emotional speech synthesis. Two hypotheses are verified in perception experiments: (I) The three diphone sets are perceived as belonging to the same speaker; (II) The vocal effort intended during database recordings ...

On the reduction of concatenation artefacts in diphone synthesis

One well-known problem with diphone concatenation is the occurrence of audible discontinuities at diphone boundaries, which are most prominent in vowels and semi-vowels. Significant formant jumps at certain boundaries suggest that the problem is of a spectral nature. We have examined this hypothesis by correlating the results of a listening experiment with spectral distances measured across dip...

Improving Speaker Identification Performance by Combining Vocal Tract Features

This paper proposes fusion and addition techniques of vocal tract features such as Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Mel Frequency Cepstral Coefficients (DMFCC) in speaker identification. Feature extraction plays an important role as a front end processing block in Speaker Identification (SI) process. Mel frequency features are used to extract the spectral characteristics o...

Optimizing Vowel Formant Measurements in Four Acoustic Analysis Systems for Diverse Speaker Groups.

PURPOSE This study systematically assessed the effects of select linear predictive coding (LPC) analysis parameter manipulations on vowel formant measurements for diverse speaker groups using 4 trademarked Speech Acoustic Analysis Software Packages (SAASPs): CSL, Praat, TF32, and WaveSurfer. METHOD Productions of 4 words containing the corner vowels were recorded from 4 speaker groups with ty...

Speaker conversion in ARX-based source-

A speaker conversion framework for formant synthesis is proposed. With this framework, given a small set of a target speaker’s utterances, segmental features of an original speech can be converted to those of the given speaker. Unlike other speaker conversion frameworks, further voice quality modification can also be applied to the converted speech with conventional formant modification techniq...

Publication date: 1998